- .ND
- .de B1
- .cm define away boxes
- ..
- .de B2
- .cm define away boxes
- ..
- .cm .nr PS 11
- .pl 10i
- .nr LL 6.3i
- .nr LT 6.3i
- .nr PO .8i
- .ds LH NNStat--Internet Statistics Package
- .ds CH
- .ds RH Braden & DeSchon
- .ds CF Page %
- .LP
- .nh
-
-
-
- .ce 10
- .LG
- .LG
- NNStat:
-
- Internet Statistics Collection Package
-
- -- Introduction and User Guide *
-
-
-
- .NL
- Robert T. Braden
- Annette L. DeSchon
-
- USC / Information Sciences Institute
- Marina del Rey, California
-
- November 28, 1988
-
-
-
- .in +0.4i
- .nr LL 5.9i
- .ll 5.9i
- .SM
- .ce
- .mc |
- RELEASE 2.2
- .ce 0
-
- This document describes the installation and operation of Release 2.2 of
- NNStat, a package of programs for the distributed collection of Internet
- traffic statistics.
-
- Release 2.2 differs from 2.1 in the following important ways:
- .IP *
- Adds new filter object eqf (EQ), equivalent to setf with one element but much
- more efficient.
- .IP *
- Supports either the SunOS 4.0 NIT interface (with appropriate bug fixes)
- or the original SunOS 3.x NIT interface.
- .IP *
- When the same object is attached with a new list of parameters, verifies that
- the new parameters are identical to the original set. In Release 2.1, the
- user could accidentally reuse the same object name with a different
- set of parameters; the original object would be used and the later parameters
- ignored, without any diagnostic message.
- .IP *
- Contains a number of internal efficiency improvements and some minor
- cleanups.
- .IP *
- Fixes a serious bug in the distributed Release 2.1, that caused
- matrix-sym objects to perform incorrectly.
- .IP *
- Fixes a bug in the display of percentages for working-set objects.
- .IP *
- Fixes a bug that allowed excessively long enumeration labels to
- clobber the stack.
- .LP
- .mc
-
- .FS *
- This work was supported by the National Science Foundation
- under Contract NCR-8718217.
- .FE
-
-
- .LP
- .nr LL 6.3i
- .ll 6.3i
- .in 0
- .bp
-
-
- .NH
- Introduction
- .LP
- NNStat is a set of programs that comprise a facility for the
- distributed collection of Internet traffic statistics. This facility
- is designed to support the requirements of a network administrator for
- gathering long-term usage statistics simultaneously
- at many network entry points.
- Although it is primarily intended for collecting
- long-term traffic statistics for administration,
- management, and topology engineering, NNStat is sufficiently
- general to be useful for operational problem solving.
-
- Distributed statistics collection has two aspects: acquisition of the
- primary data at multiple locations, and collection of all the acquired
- data into a single location.
-
- .IP (1)
- Distributed Data Acquisition
-
- The raw data must be acquired at a number of network/Internet points
- simultaneously.
- In the NNStat model, there will be a
- .I
- statistics acquisition agent
- .R
- (SAA) process
- executing in a computer system
- attached to each network/Internet node for which data is required.
- The SAA machines could be packet switches, gateways,
- general-purpose hosts, or hosts dedicated to the
- acquisition function.
-
- .IP (2)
- Centralized Data Collection
-
- Data (or summaries of data)
- acquired by the SAA processes must
- be transmitted to a central site for analysis, reporting, and
- long-term storage. This central site, the
- .I
- statistics collection host
- .R
- (SCH), will run a data collection
- program to gather the data from the SAA processes.
- In many cases, a single locus for data collection is sufficient;
- however, it should be possible to have multiple SCH's simultaneously
- gathering data from the same set of acquisition agents.
- We may think of a primary SCH that serves as a
- central repository for usage data by a particular administration,
- with perhaps secondary collection hosts being used intermittently
- for short-term statistical studies.
- .LP
-
- The principal components of the NNStat package are an SAA program
- and an SCH program. The NNStat design
- is based upon the common use of Ethernets for interconnection of
- networks and Internet regions. NSFnet provides an example:
-
- .IP o
- Each component of NSFnet above the campus level (i.e.,
- the NSFnet Backbone and each of the middle-level networks)
- consists of a set of IP gateways connected by serial lines.
-
- .IP o
- Each gateway is also connected to an Ethernet that is used as the
- interconnect medium to one or more lower-level networks.
- We refer to this as an \fIinterconnect Ethernet\fP.
- .LP
-
- Figure 1 shows a typical configuration at one of the network nodes.
- The gateway G is a packet switch that forms part of the network
- under consideration. G1 and G2 are entrance gateways to the same
- or different lower-level networks.
-
- .cs R 18
- .ss 18
- .nf
- .in +0.8i
-
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Lower-level Network(s)
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ |
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ |
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ G1\ \ \ \ \ \ \ G2
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ |
- \ \ \ \ \ \ \ \ \ Interconnect\ |\ \ \ Ether|net
- \ \ \ \ \ \ \ \ |======.======.========.========.====|
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ __|__
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ G\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | SAA |
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ /\ \\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ |_____|
- \ \ \ \ \ \ \ \ \ \ \ \ \ /\ \ \ \\
- \ \ \ \ \ \ \ \ \ \ \ \ /\ \ \ \ \ \\
- \ \ \ \ \ \ \ Serial lines to other
- \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ network nodes
-
-
- Figure 1. Typical Network Node Configuration
-
-
- .cs R
- .fi
- .in 0
-
- Interconnect Ethernets provide convenient and appropriate
- points for gathering NSFnet statistics. They are convenient because
- an SAA executing on a host connected to one of these Ethernets
- can monitor the traffic in promiscuous mode (see Figure 1).
- Thus, we can monitor the entrance and exit traffic
- without changing any gateway code.
- The interconnect
- Ethernets are also appropriate
- points for administrative statistics-gathering.
- Administrators and traffic planners are concerned mainly with
- packets entering and leaving the network; the fact that traffic
- between individual network routers cannot be monitored from the
- Ethernets is not a serious drawback.
-
- Implementing NNStat in an SAA host rather than in a
- gateway or packet switch had a number of advantages.
-
- .IP (1)
- Timeliness:
- The facilities provided by NNStat were needed quickly for NSFnet
- management.
- It will be some time before equivalent traffic measurement
- standards are developed and implemented by gateway vendors.
-
- .IP (2)
- Generality:
- We wanted to incorporate a degree of
- flexibility and generality into NNStat that is not currently available
- in gateways.
-
- .IP (3)
- Performance:
- Comprehensive statistics gathering requires a non-trivial amount of
- CPU time and memory space; it is very undesirable to burden the current
- generation of gateways with this additional resource drain.
-
- .IP (4)
- Experimentation:
- By implementing this function outside gateways, we are free to experiment
- with different approaches; eventual incorporation of our results into
- gateways is a reasonable goal.
-
- .IP (5)
- Universality:
- There may not be a gateway at the point to be monitored; for example,
- there might be a link-level bridge.
- .LP
-
- The primary task of the SAA is to count the
- occurrences of packets with \*Qinteresting\*U configurations of values in
- their header fields. In the NNStat design, what is \*Qinteresting\*U is
- determined by the SAA configurations, which
- can be set or changed dynamically.
-
- Our model of NNStat operation within a particular network is
- as follows. The administration will set up the acquisition agents,
- one at each point from which data is desired, configured to collect a basic
- set of statistics. These statistics will be reported to the SCH to be
- summarized over sites, time, and perhaps administrative subsets of
- the networks.
- In addition, management and operational personnel will dynamically
- modify the SAA configurations from time to time, to answer
- additional statistical questions about the traffic.
- .LP
-
- Finally, we should mention some non-goals for the NNStat effort.
-
- .IP o
- NNStat does not provide fancy display or analysis programs
- for presenting the statistics. This is potentially a large and
- complex problem that is outside the scope of the NNStat effort.
-
- .IP o
- NNStat cannot gather statistics for traffic on the serial lines between
- IP routers; it can measure only the network entry and exit traffic.
- NNStat is intended to complement, not replace, the statistics gathering
- facilities built into gateways. For example, gateways typically count
- line errors and dropped packets on each of their physical interfaces, to
- monitor and diagnose line problems. These facilities are vital for
- operation and maintenance of the gateways and lines, forming the
- \*Qfirst line of defense\*U for problem diagnosis. However, NNStat is
- not generally concerned with short-term operational functions.
- .LP
-
- .NH
- Overview of NNStat
- .LP
-
- The NNStat package, which has been implemented for a 4.2/3BSD system,
- includes the following components:
-
- .IP (A)
- SAA Program --
- .B statspy
-
- The statistics acquisition agent program of NNStat is named
- .B statspy .
- .B Statspy
- has been implemented for a Sun workstation, and it
- must execute with
- \*Qsuperuser\*U privileges to access the NIT interface to the Ethernet.
- However, it could be ported to any other 4.2/3BSD system that provided
- an interface for promiscuous access to the Ethernet.
-
- Each Ethernet packet that
- .B statspy
- observes contains an Ethernet header
- followed by a sequence of one or more other protocol
- headers (e.g., IP, TCP, etc.), which
- reflect the successive encapsulation implied by protocol layering.
- Each protocol header may be considered to be a string of bits that is
- logically divided into substrings called
- .I fields.
-
- A particular
- .B statspy
- process can (and typically will) gather
- a number of different statistical
- measures of the packet traffic simultaneously.
- Each of these measures is gathered by a separate \fIstatistical object\fP,
- or simply \fIobject\fP.
- The set of objects and the selection of protocol fields that they monitor
- is determined by the
- .I configuration ,
- which can be set or changed while
- .B statspy
- is executing.
-
- .B Statspy
- is controlled by a command language that provides
- commands for setting and displaying the
- configuration and for displaying the statistical data gathered by its
- objects.
- .B Statspy
- commands may be entered from three locations:
-
- .RS
- .IP o
- From a file, at start-up time.
-
- This is the recommended way to set up the
- configuration for collecting long-term statistics, so
- .B statspy
- will be self-configuring if the SAA host crashes and restarts.
-
- .IP o
- Interactively, from the local console controlling statspy.
-
- This allows
- .B statspy
- to be used as a standalone
- monitoring tool.
-
- .IP o
- Interactively, from a remote system running the
- .B rspy
- program (see below).
- .RE
-
- Section 3 describes the command language, including the
- command used to set or modify
- the configuration. Appendix C suggests useful configuration techniques.
-
- If it is executed in foreground,
- .B statspy
- accepts
- commands and displays statistics locally. Whether in foreground or
- background, it listens for a
- TCP connection from the remote collection machine (SCH) or from a
- remote
- .B rspy
- program, and processes all
- commands entered over that TCP connection.
- However, the acquisition of new
- statistical data from the Ethernet takes highest
- priority.
-
- Note that
- .B statspy
- is not expected to record its data on a local disk;
- permanent data recording is assumed to take place only at the SCH.
- This choice was made to minimize operational problems at each SAA
- site.
-
- For more details on the operation and configuration of
- .B statspy ,
- see Section 3 below.
-
- .IP (B)
- Remote SAA Control Program --
- .B rspy
-
- The
- .B rspy
- program provides an interactive command interface for controlling a
- remote
- .B statspy
- instance.
- .B Rspy
- can be used to establish, query, or modify the
- configuration
- and to read and/or clear the statistical objects.
- The use of
- .B rspy
- is described in Section 3.4.
-
- .IP (C)
- Centralized Collection Program --
- .B collect
-
- .B Collect
- is the central data collection program of NNStat; it executes
- on the SCH to
- collect data from one or more
- .B statspy
- instances.
- .B Rspy
- and
- .B collect
- use the same remote network interface to
- statspy, but they are designed for different tasks:
- while
- .B rspy
- is intended to be used interactively for
- testing, probing, and running short-term statistical studies,
- .B collect
- is intended to be
- executed as a daemon, collecting and recording traffic
- data over a long period of
- time.
-
- In normal operation,
- .B collect
- will periodically poll a specified set
- of SAA's for statistical data and write the results into cumulative data files.
- Note that data is delivered to
- .B collect
- only as a result of its polling
- the SAA's. An alternative design would have the SAA's spontaneously
- report their data periodically to the SCH. We chose to use polling for
- data collection in order to ensure (approximate) synchronization in gathering
- statistics from all the SAA's, while avoiding an \*Qimplosion\*U of reports
- at the central site.
-
- The following basic parameters must be defined to run
- .B collect :
-
- .RS
- .IP *
- List of SAA host names or addresses.
- .IP *
- TCP port for
- .B statspy
- on each SAA (optional).
- .IP *
- The name(s) of objects whose accumulated data are to be retrieved from each
- .B statspy
- instance.
- .IP *
- Polling interval Ti.
- .IP *
- Checkpoint interval Tc.
- .IP *
- Clear (\*Qreset\*U) interval Tr.
- .RE
-
- In one data collection cycle,
- .B collect
- will open a TCP connection to
- .B statspy
- on each of the listed hosts
- and retrieve data from (\*Qread\*U) the specified objects, recording the
- results in files.
- This cycle will be repeated every Ti minutes, but
- .B collect
- will save or \*Qcheckpoint\*U the data for later analysis
- only every Tc minutes.
-
- The totals returned by each poll are cumulative, unless the objects
- are explicitly cleared by command or the SAA (crashes and) restarts.
- Therefore,
- if communication between the SCH and an SAA is lost temporarily, a later
- successful poll should return complete data. The minimum polling interval
- Ti should be short enough that data lost because of an SAA restart
- will be negligible. Of course, if an SAA is down for an extended period,
- there is no way to capture statistics from that interconnect Ethernet
- for that period.
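-
- Because the totals are cumulative, a data reduction step can recover
- per-interval counts by differencing successive readings. The following
- Python sketch is illustrative only and is not part of the NNStat
- distribution; the function name is hypothetical, and it assumes that a
- reading lower than its predecessor signals a clear or an SAA restart.
-

```python
def interval_deltas(readings):
    """Turn cumulative counter readings into per-interval counts.

    Assumption: a reading lower than the previous one means the counter
    was cleared or the agent restarted, so the new reading is itself the
    count accumulated since that reset.
    """
    deltas = []
    previous = None
    for value in readings:
        if previous is None:
            deltas.append(value)             # first poll: everything so far
        elif value >= previous:
            deltas.append(value - previous)  # normal cumulative growth
        else:
            deltas.append(value)             # reset detected
        previous = value
    return deltas


# Hypothetical readings; one poll between 250 and 700 was lost, and the
# agent restarted before the last poll.
polls = [100, 250, 700, 40]
print(interval_deltas(polls))
```

- A lost poll costs time resolution (the 450 packets span two intervals)
- but no packets; only the counts accumulated between the last
- successful poll and a restart are lost.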
-
- .B Statspy
- generally keeps 32-bit counters for counting packet events.
- If the average rate were 1000 packets per second, some counters might
- overflow once every 4 weeks.
- .B Statspy
- makes no special provision for overflow, but instead expects that
- .B collect
- will be set up to
- periodically clear all the counters using the Tr parameter.
- Every Tr minutes,
- .B collect
- will instruct each
- .B statspy
- to clear its data counters
- after the current values are retrieved.
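-
- The overflow estimate can be checked with a short calculation. The
- Python sketch below is an illustration, not NNStat code. The figure of
- about four weeks matches a counter with 31 usable bits (a signed 32-bit
- integer); whether the counters are in fact signed is an assumption
- here, so the unsigned case is shown as well.
-

```python
RATE = 1000                          # packets per second, as in the text
SECONDS_PER_WEEK = 7 * 24 * 3600


def weeks_to_overflow(usable_bits, rate=RATE):
    """Weeks until a counter with the given usable width wraps."""
    return (2 ** usable_bits) / rate / SECONDS_PER_WEEK


signed_weeks = weeks_to_overflow(31)     # signed 32-bit (assumed)
unsigned_weeks = weeks_to_overflow(32)   # unsigned 32-bit

# Packets counted in one day at the same rate, for comparison.
per_day = RATE * 24 * 3600

print(f"signed:   {signed_weeks:.1f} weeks")
print(f"unsigned: {unsigned_weeks:.1f} weeks")
print(f"one day:  {per_day} packets, limit {2 ** 31}")
```

- Either way, a counter that is cleared once a day stays far below the
- limit at this packet rate.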
-
- Suggested values for the time parameters to
- .B collect
- are:
-
- Ti = 5 minutes
- Tc = 60 minutes
- Tr = 1440 minutes (24 hours).
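-
- The relationship among the three intervals can be sketched as follows.
- This Python fragment is a simplified model, not the collect program:
- it assumes that time is counted in whole minutes from the start of
- collect and that checkpoints and clears happen only on polls. The
- function name is hypothetical.
-

```python
def cycle_actions(minute, ti=5, tc=60, tr=1440):
    """Return the actions due at a given minute since collect started."""
    actions = []
    if minute % ti == 0:
        actions.append("poll")            # read the objects from each SAA
        if minute % tc == 0:
            actions.append("checkpoint")  # save the data for later analysis
        if minute % tr == 0:
            actions.append("clear")       # reset the counters after reading
    return actions


for minute in (5, 7, 60, 1440):
    print(minute, cycle_actions(minute))
```

- With the suggested values, each SAA is polled 288 times a day, 24 of
- those polls are checkpointed, and the counters are cleared once.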
-
- .B Collect
- will produce a separate data file for each (statistical measure, SAA
- host) pair, for all the statistical measures and hosts specified in its
- parameters. Each of these data files will contain the read data for
- every checkpoint time, plus the last data recorded before the
- .B statspy
- was restarted or its object(s) cleared, and will be
- cumulative from the time that the
- .B collect
- program was started. If the SCH crashes or
- .B collect
- is restarted for some reason, a new set of data files will be created.
-
- Section 4 explains how to use
- .B collect .
-
-
- .IP (D)
- Data Reduction Programs
-
- The NNStat distribution includes some useful programs and AWK scripts for
- processing and summarizing the data files created by the
- .B collect
- program. These will be described in Section 4.
- .LP
-
- .bp
- .NH
- Statspy
- .LP
-
- .NH 2
- Using Statspy
- .LP
-
- To execute \fBstatspy\fP, issue the following system command:
-
- .B1
- .nf
-
- \fBstatspy\fP [\fB\-i \fIinterface\fR] [\fB\-p \fIport\fR] [\fB\-h\fP] [\fB\-t \fItimeout\fR] [\fIcommand-file\fP]
-
- .fi
- .B2
-
- The parameters are:
-
- .IP \fB\-i\fP
- Ethernet interface device name; the default is ie0.
-
- .IP \fB\-p\fP
- TCP port number on which
- .B statspy
- will listen for a connection from
- .B collect
- or
- .B rspy .
- The default is 2222.
- .I
- At present, the only security mechanism in statspy is the use of
- a private TCP port number.
- .R
-
- .IP \fB\-h\fP
- Causes
- .B statspy
- to write a history of remote commands into the standard output.
-
- .IP \fB\-t\fP
- Specifies watchdog timeout value in seconds.
-
- During a remote operation,
- .B statspy
- sets a watchdog timer to detect
- a failure of the
- connection or remote
- .B collect
- program. The \fB\-t\fP parameter may be used to
- override the default timeout of 120 seconds.
-
- .IP \fIcommand-file\fP
- This optional parameter is the name of a file containing
- commands to be executed when
- .B statspy
- starts.
- Normally, these will be commands to establish the
- initial configuration of objects for gathering data.
- If this parameter is omitted,
- .B statspy
- will await commands from the local console
- or from
- .B rspy
- (executing either on the SAA host or remotely)
- to establish the configuration.
- .LP
-
- When it starts,
- .B statspy
- executes the commands found in \fIcommand-file\fP, if any. If it has
- been executed in foreground,
- .B statspy
- then enters an interactive command mode
- in which it repeatedly issues a prompt (\*Q>\*U) and awaits command input.
- If
- .B statspy
- is executed in background, its standard output and
- standard error output should be directed to a file to aid diagnosis
- in case a problem occurs.
-
- .B Statspy
- listens on the specified TCP port for a connection from a
- remote
- .B collect
- or
- .B rspy
- program. It is currently limited to
- one TCP connection at a time, so the TCP connection is opened for
- each sequence of remote commands
- and closed again when the responses have been returned.
-
- .NH 2
- Statspy Command Language
- .LP
-
- The operation and configuration of
- .B statspy
- are controlled by a simple
- command language. Commands to
- .B statspy
- can be entered from three
- sources:
-
- .IP (1)
- the initial command file (see preceding section);
-
- .IP (2)
- interactively from the controlling console (i.e., from
- standard input);
-
- .IP (3)
- remotely from a
- .B collect
- or
- .B rspy
- program.
- .LP
-
- Remote command requests have priority over local commands, while
- processing new data from the Ethernet generally preempts either local or
- remote
- command processing.
-
- Commands from any source are free-form and may occupy as many lines as
- necessary. Any text following the \*Q#\*U character and up to the
- next newline will be ignored, to allow comments in the
- command stream.
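-
- The free-form and comment rules can be illustrated with a small
- tokenizer. This Python sketch is not the statspy parser; it assumes
- that tokens are separated by white space and that the comment
- character never appears inside a token.
-

```python
def tokens(text):
    """Split a command stream into tokens, ignoring '#' comments."""
    result = []
    for line in text.splitlines():
        code = line.split("#", 1)[0]   # drop '#' through the end of the line
        result.extend(code.split())    # line breaks are ordinary white space
    return result


stream = """read        # a command may span
  *IP*                  # several lines
show ?"""
print(tokens(stream))
```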
-
- Various
- .B statspy
- commands reference objects and protocol fields by name.
- The field names are built into the program (see Figure 2 in Section 3.3),
- while object names are
- assigned by configuration commands.
- There are commands
- to return complete lists of the
- names of objects or fields (\*Qread ?\*U or \*Qshow ?\*U, respectively).
-
- Commands refer to particular objects by their names. They can refer to
- a set of objects by using a \*Qwildcard\*U matching scheme. An
- object specification parameter, known as an \fIobject spec\fP,
- may contain asterisks as wildcard characters
- to match any number of characters.
- For example, the command:
-
- read *IP*
-
- will apply the read operation to all objects whose names include the
- string \*QIP\*U, and
-
- read *
-
- will read all objects.
- In setting up the configuration, the user should choose a consistent scheme
- for assigning object names to increase the
- usefulness of this wildcard matching.
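-
- The matching rule can be sketched in Python as follows. This is an
- illustration rather than the statspy implementation; it assumes that
- matching is case-sensitive and that the asterisk is the only special
- character. The object names in the example are hypothetical.
-

```python
import re


def spec_matches(spec, name):
    """True if an object spec with '*' wildcards matches an object name."""
    # Each '*' matches any run of characters; everything else is literal.
    pattern = ".*".join(re.escape(part) for part in spec.split("*"))
    return re.fullmatch(pattern, name) is not None


names = ["IP.protocol.freq", "TCP.ports", "srcIPnet", "gwys"]
print([n for n in names if spec_matches("*IP*", n)])
print([n for n in names if spec_matches("*", n)])
```

- A naming scheme that puts the protocol name in every object name, as
- in the example, lets a single spec select all objects for one protocol.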
-
- As we will see in the next section, some objects do not themselves gather
- data, but instead conditionally select other objects that do.
- Such conditional objects can be left unnamed, since they
- will generally not be referenced by a command after they are created.
- Commands differ in how they treat such unnamed objects
- (see below).
-
- We now list all the commands recognized by
- .B statspy.
-
- .IP o
- read <object spec>
-
- Displays the data recorded by the object(s) whose names
- match <object\ spec>.
- Unnamed objects cannot be the target of a read operation.
-
- .IP o
- read ?
-
- Displays a summary, which includes the names of all objects.
- Unnamed objects will be included in this summary.
-
- .IP o
- clear <object spec>
-
- Sets all objects whose names match <object\ spec> to their initial
- states, i.e., clears their statistical accumulation.
- \*QClear *\*U will clear all objects, including
- unnamed objects.
-
- .IP o
- readclear <object spec>
-
- Executes a read followed by a clear operation, atomically.
- \*QReadclear *\*U will clear but not read all unnamed objects.
-
- .IP o
- show *
-
- Displays a summary of the current configuration.
-
- .IP o
- show ?
-
- Displays a list of the built-in field names.
-
- .IP o
- attach { <configuration program> }
-
- Augments the current configuration with the additional statistical object(s)
- specified by <configuration program>. The curly braces are required.
- The <configuration program> is written using a set of rules
- that we will refer to as the
- .I configuration language,
- although
- it is really a sub-language of the command
- language; see Section 3.3 for details.
-
- The \fIattach\fP command is atomic -- if any error is found,
- the current configuration will remain unchanged.
-
- .IP o
- detach <object spec>
-
- Deletes from the configuration each object whose name matches <object\ spec>.
- This may implicitly delete other objects in order to
- keep the configuration consistent.
- \*Qdetach *\*U will detach all objects, including those that have no names.
-
- .IP o
- ?
-
- Displays a list of the commands. This command is only available on the local
- console.
-
- .IP o
- quit
-
- Exits to the operating system (shell) on the SAA host. This command cannot
- be issued across the network.
-
- .IP o
- enum { <enum parameters> }
-
- Defines a set of label strings for use in \fIread\fP command displays. See
- Section 3.3.4 for more explanation.
- The curly braces are required.
- .LP
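- The atomic behavior of attach, together with the Release 2.2 check
- that a reused object name carries identical parameters, can be
- sketched in Python as follows. This is an illustration only: the
- representation of a configuration as a dictionary, the exception
- class, and the function name are all hypothetical.
-

```python
class ConfigError(Exception):
    """Raised when an attach request cannot be applied."""


def attach(config, new_objects):
    """Return a new configuration with new_objects added.

    Both arguments map object names to (class_name, params) tuples. The
    existing configuration is never modified, so a failed attach leaves
    it unchanged.
    """
    candidate = dict(config)                  # work on a copy
    for name, definition in new_objects.items():
        if name in candidate and candidate[name] != definition:
            raise ConfigError(f"{name}: parameters differ from original")
        candidate[name] = definition
    return candidate


config = {"gwys": ("freq-all", ())}
config = attach(config, {"local": ("eqf", ("128.9.0.0",))})
try:
    config = attach(config, {"extra": ("freq-all", ()),
                             "local": ("eqf", ("10.0.0.0",))})
except ConfigError as err:
    print("rejected:", err)
print(sorted(config))
```

- The second attach fails on the conflicting parameters, so the object
- named extra is not added either.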
-
- Except where noted above, these commands may be entered remotely from
- .B rspy
- or locally.
- The
- .B collect
- program effectively issues the \fIread\fP and \fIreadclear\fP commands.
-
- When a \*Qshow ?\*U command is issued to
- .B statspy,
- the first line displayed summarizes the overall packet processing since
- .B statspy
- was started. For example:
-
- .SM
- Acquired 56343 packets in 163 secs=> 345(avg) 755(max) 1250(inst)/sec
- .NL
-
- This shows the total Ethernet
- packets acquired, the elapsed time since
- .B statspy
- started, the average packets
- per second, the maximum number of packets processed in one second, and
- finally the maximum \*Qinstantaneous\*U packet rate. The last is obtained
- by extrapolating to one second the maximum number of packets captured
- in one clock tick (20ms on the Sun workstation).
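-
- The figures in the example line are related by simple arithmetic,
- sketched here in Python. The per-tick maximum of 25 packets is
- inferred from the displayed rate and is not given in the original
- output; truncation of the average is likewise an assumption.
-

```python
TICK_MS = 20                       # clock tick on the Sun workstation


def summarize(total_packets, elapsed_secs, max_per_tick):
    """Return (average, instantaneous) packet rates per second."""
    average = total_packets // elapsed_secs          # truncated (assumed)
    ticks_per_second = 1000 // TICK_MS
    instantaneous = max_per_tick * ticks_per_second  # one tick scaled to 1 s
    return average, instantaneous


print(summarize(56343, 163, 25))
```

- The instantaneous rate can exceed the one-second maximum because a
- burst that fills one tick need not last for a full second.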
-
- .NH 2
- Configuring Statspy
- .LP
-
- We divide the extraction of statistical data from a particular
- Ethernet packet into two phases:
-
- .IP (1)
- Parse the protocol headers
- to determine the values of
- the various header fields.
-
- Since efficiency is essential and packet header formats do
- not change very often, the header formats are compiled into the
- .B statspy
- code.
- Each incoming Ethernet packet is passed to a subroutine
- that \*Qknows\*U how to parse all the headers and where to locate
- the fields. To add new protocols or change header formats,
- it will be necessary
- to recompile this packet-parsing subroutine of
- .B statspy.
-
- .IP (2)
- Analyze the parsed field values and gather the
- desired statistics.
-
- This phase is performed interpretively, using a set of rules that
- comprises the
- .B statspy
- configuration.
- .LP
-
- .NH 3
- Fields
- .LP
-
- Figure 2 shows a list of the fields that will be extracted by
- .B statspy
- and made available to the analysis phase.
- A particular packet will define values for only a subset of
- the possible fields; for example, a TCP packet will define the TCP source
- and destination ports but cannot define UDP ports or an ICMP type field.
-
- As Figure 2 shows, each field is assigned a mnemonic
- name string, a size in bytes,
- and an intrinsic type. The type is used principally to choose an appropriate
- format for displaying the data values from that field.
- Each field is extracted into an integral number of
- 8-bit bytes. Thus, the IP version number (field \*QIP.version\*U)
- is actually 4 bits but is extracted (right-justified) by the parser
- into a byte.
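-
- Right-justified extraction of a sub-byte field amounts to a shift and
- a mask, as in this Python sketch. The helper name and the sample
- header byte are hypothetical; this is not the statspy parser.
-

```python
def right_justified(byte, high_bit, width):
    """Extract `width` bits whose top bit is `high_bit` (7 = most significant)."""
    shift = high_bit - width + 1
    return (byte >> shift) & ((1 << width) - 1)


first_byte = 0x65                  # hypothetical: version 6, header length 5
print(right_justified(first_byte, 7, 4))   # the IP.version value
```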
-
- .KF
- .nf
- .RS
-
- .B1
- .nf
-
- .ta 1.8iC 3.0i
- Field Name Length(bytes) Type
- .ta 1.8iR 3.0i
-
- Ether.src 6 Ethernet Address
- Ether.dst 6 Ethernet Address
- Ether.type 2 Integer
-
- IP.version * 1 Integer
- IP.length 2 Integer
- IP.option * 1 Integer
- IP.TOS 1 Bits
- IP.offset * 2 Integer
- IP.protocol 1 Integer
-
- IP.srchost 4 IP Address
- IP.dsthost 4 IP Address
- IP.srcnet * 4 IP Address
- IP.dstnet * 4 IP Address
-
- TCP.srcport 4 Integer
- TCP.dstport 4 Integer
- UDP.srcport 4 Integer
- UDP.dstport 4 Integer
- ICMP.type 1 Integer
-
- .ta 1.8iC 3.0i
- packet * Variable Bits
-
-
- Figure 2. Field Definitions in Packet Parser
-
- .B2
-
- .fi
- .KE
- This list includes virtual fields whose values are derived from those
- actually appearing in the header; these
- are marked with \*Q*\*U in Figure 2.
- The virtual fields have
- the following meanings:
-
- .IP (a)
- IP.option
-
- This is the code byte for each IP option field found in the packet,
- or zero if there are no options. Note that a single packet may
- contain several options, so this pseudo-field may be multiply defined.
-
- .IP (b)
- IP.srcnet, IP.dstnet
-
- These are the (Class A, B, or C) network numbers derived from the
- real IP source and destination address fields, respectively.
- These virtual fields provide a simple and efficient way to develop
- statistics based upon networks rather than hosts.
-
- .IP (c)
- IP.version
-
- This virtual field containing the IP version number
- is extracted by
- .B statspy
- for analysis \fIonly\fP for a packet with a non-standard
- IP version number (i.e., not 4). No later fields
- (IP, TCP, or UDP) can or will be extracted from the same packet.
- .bp
-
- .IP (d)
- IP.offset
-
- This virtual field is extracted
- (we say \*Qdefined\*U) by
- .B statspy
- only for a packet that is a fragment
- of a complete IP datagram.
- When it is defined, IP.offset is the
- reassembly offset of this fragment in bytes (i.e., 8 times the offset
- field in the IP header).
-
- Only the first fragment, i.e., the fragment at offset zero, can be
- parsed further for higher-level protocol headers (TCP, UDP, or ICMP).
- We made the reasonable assumption that these headers will always fit
- within the first fragment, i.e., that the first fragment will always
- be larger than about 90 bytes (unless it is also the last fragment).
-
- Note that for a fragmented packet the IP.length, IP.option, and
- IP.TOS fields are defined for each fragment separately. Thus, IP.length
- gives the length of the fragment;
- .B statspy
- cannot determine the total length of the reassembled IP datagram.
-
- .IP (e)
- packet
-
- This virtual field contains the binary value of the header sequence.
- It is intended for recording particular packet headers for diagnostic
- rather than statistical purposes.
- .LP
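- The derivation of IP.srcnet, IP.dstnet, and IP.offset from the real
- header can be sketched in Python as follows. This is an illustration
- under stated assumptions, not the statspy parser: the function names
- are hypothetical, the sample header is invented, and the treatment of
- Class D and E addresses is a guess.
-

```python
import struct


def classful_net(addr):
    """Return the Class A, B, or C network number of a 32-bit IP address."""
    first_octet = addr >> 24
    if first_octet < 128:              # Class A: 8-bit network part
        return addr & 0xFF000000
    if first_octet < 192:              # Class B: 16-bit network part
        return addr & 0xFFFF0000
    if first_octet < 224:              # Class C: 24-bit network part
        return addr & 0xFFFFFF00
    return addr                        # Class D/E: unchanged (assumption)


def fragment_offset(header):
    """Return the reassembly offset in bytes, or None if not a fragment."""
    (flags_and_offset,) = struct.unpack_from("!H", header, 6)
    more_fragments = bool(flags_and_offset & 0x2000)
    offset_units = flags_and_offset & 0x1FFF
    if not more_fragments and offset_units == 0:
        return None                    # complete datagram: field undefined
    return offset_units * 8            # header counts in 8-byte units


def dotted(addr):
    """Format a 32-bit address in dotted-decimal notation."""
    return ".".join(str((addr >> shift) & 0xFF) for shift in (24, 16, 8, 0))


# Invented 20-byte header: 128.9.1.2 to 10.1.2.3, more-fragments flag
# set, offset field 185.
header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 1500, 1, 0x2000 | 185,
                     64, 6, 0, bytes([128, 9, 1, 2]), bytes([10, 1, 2, 3]))
src, dst = struct.unpack_from("!II", header, 12)
print(dotted(classful_net(src)), dotted(classful_net(dst)),
      fragment_offset(header))
```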
-
- .NH 3
- Objects and Invocations
- .LP
-
- Statistical analysis of field values is performed by a set of
- .B statspy
- entities known as (statistical)
- .I objects.
- NNStat implements unary and binary objects, i.e., objects that take one
- and two input values, respectively. Each object may have a unique name that
- is assigned when the object is defined.
-
- Objects
- are logically independent of fields; objects
- simply build and report statistical data structures based on
- (field) values written
- into them.
- The analysis phase is essentially a series of calls on object
- .I Write
- subroutines; in each call,
- a particular field value (or pair of field values) is
- passed as a parameter.
- These Write subroutine calls are known as
- .I invocations.
-
- For example, the configuration might specify that an object named
- \*QProtocol.freq\*U is to be invoked on the field named
- \*QIP.protocol\*U; that is, the value of field \*QIP.protocol\*U will be
- written into object \*QProtocol.freq\*U. The configuration may specify
- that the same field is to invoke more than one object. Conversely, the
- same object may be invoked on more than one field, to build a composite
- statistic. Fields that invoke the same object must be compatible, i.e.,
- they must have the same size and type (see Figure 2 for the types).
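-
- The relationship between fields, objects, and invocations can be
- sketched in Python as follows. This is an illustration, not the
- statspy code: the class, its method names, and the object name are
- hypothetical, and the field table holds only three entries copied
- from Figure 2.
-

```python
FIELDS = {                          # name: (size in bytes, type)
    "Ether.src": (6, "Ethernet Address"),
    "Ether.dst": (6, "Ethernet Address"),
    "IP.protocol": (1, "Integer"),
}


class FreqObject:
    """Counts how often each distinct value is written into the object."""

    def __init__(self, name):
        self.name = name
        self.field_kind = None      # (size, type) shared by invoking fields
        self.counts = {}

    def bind(self, field):
        """Note that `field` invokes this object; reject incompatible fields."""
        kind = FIELDS[field]
        if self.field_kind not in (None, kind):
            raise ValueError(f"{field} is incompatible with object {self.name}")
        self.field_kind = kind

    def write(self, value):
        """One invocation: a field value is written into the object."""
        self.counts[value] = self.counts.get(value, 0) + 1


addrs = FreqObject("addrs")
addrs.bind("Ether.src")
addrs.bind("Ether.dst")             # same size and type: allowed
try:
    addrs.bind("IP.protocol")       # 1-byte Integer: rejected
except ValueError as err:
    print(err)
addrs.write("8:0:2:0:49:30")        # value from Ether.src of one packet
addrs.write("8:0:2:0:49:30")        # value from Ether.dst of another
print(addrs.counts)
```

- The object itself never sees which field supplied a value, which is
- why fields sharing an object must agree in size and type.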
-
- Each object is an instance of a particular object class; all objects of
- the same class share the same program modules but each has its private
- data structure. The
- .B statspy
- object classes generally fall into two categories: recorders and
- filters.
-
- .IP (A)
- Recorders
-
- A data recorder object or
- .I recorder
- builds some statistical data structure (e.g., a
- frequency distribution table) from the field values with which it is invoked.
- .RS
-
- Example: \fIfreq-all\fP
-
- An object
- of class \fIfreq-all\fP builds a frequency distribution table for all
- distinct values of the field on which it is invoked.
- .RE
-
- Figure 3 shows
- an example of the display output resulting from a read operation
- on a \fIfreq-all\fP object named \*Qgwys\*U. The field values recorded in
- this object are 48-bit Ethernet addresses.
-
- .KS
- .B1
- .nf
-
- OBJECT: gwys Class= freq-all [CreationTime: 11:51:25 11-05-87]
- ReadTime: 11:52:18 11-05-87,
- ClearTime: 11:51:25 11-05-87 (@ -53 secs)
- Total Count= 492 (+0 orphans)
- #bins= 8
- [2:7:1:0:8:30]= 219 (45%) @- 1secs
- [8:0:2:0:49:30]= 127 (26%) @- 1secs
- [2:60:8c:ee:2:34]= 52 (11%) @- 1secs
- [24:24:80:9:0:6b]= 44 (8.9%) @- 1secs
- [8:0:14:10:12:8]= 27 (5.5%) @- 1secs
- [8:0:2:0:f7:2b]= 20 (4.1%) @- 2secs
- [aa:0:3:1:5:90]= 2 (0.41%) @- 13secs
- [2:7:1:0:4:45]= 1 (0.2%) @- 32secs
-
-
- Figure 3. Example of Read Output
-
- .fi
- .B2
- .KE
-
- Each of the bottom 8 lines displays the contents of one bin:
- the value (6 bytes in hex),
- the count, the percentage count,
- and the last-update time relative to the current time (\*QReadTime\*U).
-
- .IP (B)
- Filters
-
- Data filter objects or
- .I filters
- provide conditional branches in the
- configuration.
-
- When invoked with a field value, a filter tests it against some
- numerical or set-inclusion criterion, to determine a Boolean (True/False)
- value. This Boolean value is then used by the \fBstatspy\fP interpreter
- to select one of two alternative sub-sequences of invocations, where either
- of these sub-sequences may be empty.
-
- .mc |
- .RS
- Example: \fIeqf\fP
-
- An object of class \fIeqf\fP tests a field value for
- equality to a parameter
- value that is specified when the object is created.
- .RE
-
- .mc
- By nesting filter invocations in a configuration, record invocations can
- be conditioned upon an arbitrary Boolean expression over field values.
-
- Figure 4 shows an example of (a fragment of) a pseudo-program,
- in flow-chart form. This
- sequence of invocations was designed to answer the question: \*Qwhat are
- the Ethernet addresses of hosts sending or receiving transit packets,
- i.e., IP packets that neither originate nor terminate on the local
- Ethernet?\*U
- .KF
- .nf
- .ta 0.5i 2.5i
- .lc _
-
- ____________________________
- | Invoke \fIeqf\fP filter object |
- | with parm (128.9.0.0) |
- | on field \*QIP.srcnet\*U |
- |____________________________|
- .ta 1i 2i
- | |
- | TRUE | FALSE
- V |
- (Null) |
- |
- V
- .ta 0.75i 2.75i
- _____________________________
- | Invoke \fIeqf\fP filter object |
- | with parm (128.9.0.0) |
- | on field \*QIP.dstnet\*U |
- |____________________________|
- .ta 1.25i 2.25i
- | |
- | TRUE | FALSE
- V |
- \ \ \ \ \ \ (Null) |
- |
- V
- .ta 1.0i 3.0i
- __________________________
- | Invoke \fIfreq-all\fP recorder |
- | object named \*Qgwys\*U |
- | on field \*QEther.src\*U |
- |__________________________|
- .ta 2.0i
- |
- |
- V
- .ta 1.0i 3.0i
- __________________________
- | Invoke \fIfreq-all\fP recorder |
- | object named \*Qgwys\*U |
- | on field \*QEther.dst\*U |
- |__________________________|
-
-
- Figure 4. Example Pseudo-program
-
-
- .fi
- .KE
-
- Figure 4 includes two invocations
- of an unnamed \fIeqf\fP filter object whose parameter is the value
- 128.9.0.0 (the IP network number of the local Ethernet).
- Thus, the first invocation shown in Figure 4 will return
- TRUE if the IP source network number in field \*QIP.srcnet\*U
- is 128.9.0.0, and FALSE otherwise.
-
- .NH 3
- Configuration Language
- .LP
-
- We begin with an example of the configuration (sub-)language for
- .B statspy .
- The configuration of Figure 4 could be created by entering an
- \fIattach\fP command with the <configuration program> shown in Figure 5.
-
- .mc |
- .KS
- .nf
- .RS
- .B1
-
- if IP.srcnet isnot eqf(128.9.0.0) {
- if IP.dstnet isnot eqf(128.9.0.0) {
- record Ether.src in gwys freq-all;
- record Ether.dst in gwys;
- }
- }
-
- .fi
- .RE
- Figure 5. Configuration program that compiles into Figure 4.
-
- .B2
- .KE
-
- .mc
- Figure 5 illustrates several points about the configuration language:
-
- .IP 1.
- The language is free-field with newlines having no meaning. Hence, we
- can indent to illuminate the structure of the program.
-
- .IP 2.
- The language includes compound statements and \*Qif\*U statements; the latter
- correspond to invocations of filter objects.
-
- .IP 3.
- The first time a named object occurs, its class (and parameters, if any)
- must be specified. They may be omitted in later references to the same
- object.
- .mc |
- If they are included, their values must exactly match the parameters
- specified in the first occurrence of the same object.
- .mc
-
- .IP 4.
- Parameters, when required, are enclosed in parentheses
- following the class name. If there is more than one parameter, they
- are listed separated by commas.
-
- .IP 5.
- An unnamed object may be created by giving only its class (and parameters,
- if any). Filter objects are often left unnamed, since there is
- usually no need to read them.
-
- Note: class names are reserved, and must not coincide with object names.
- The valid class names are all listed in Appendix A.
- .LP
-
- Two more things about the language are not apparent from this
- example:
-
- .IP 6.
- The outer set of
- braces in Figure 5 is unnecessary. The syntax of the
- configuration language is generally like \*QC\*U.
-
- .IP 7.
- Specific configuration rules are triggered only when the fields upon
- which they depend are defined in a packet.
-
- Thus, in Figure 4 it was not necessary to explicitly test that the packet
- in question was an IP packet; if it were not an IP packet, then the
- IP.srcnet and IP.dstnet fields would not be defined.
-
- Furthermore,
- .B statspy
- checks and enforces consistency among the fields, so that the
- configuration cannot include invocations that are logically impossible.
- An example of such an illegal configuration is:
-
- .mc |
- if TCP.srcport is eqf(23)
- if UDP.dstport is eqf(6)
- record Ether.src in Imposs-obj freq-all;
- .mc
-
- This configuration is illegal because TCP.srcport and UDP.dstport cannot
- be defined in the same packet, hence the \*QImposs-obj\*U recorder
- would never be invoked.
- .LP
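-
- As a sketch of point 6, the test charted in Figure 4 can be written
- without the outer braces. This fragment assumes that \*Qisnot\*U (see
- FILTER OBJECTS below) selects the FALSE exit of each \fIeqf\fP filter,
- which is the exit on which Figure 4 goes on to record:
-
- .DS
- if IP.srcnet isnot eqf(128.9.0.0)
-     if IP.dstnet isnot eqf(128.9.0.0) {
-         record Ether.src in gwys freq-all;
-         record Ether.dst in gwys;
-     }
- .DE
-
- The inner braces are still needed, because they group the two record
- statements into a single sublist.
- .LP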
-
- We now list general syntax rules for the configuration language.
- Appendix B specifies the exact syntax of the
- configuration language, using BNF.
-
- .IP (1)
- RECORDER OBJECTS:
-
- To create a (unary) recorder object that is invoked on a specified field, use
- the following statement:
-
- \fBrecord\fP <field name> \fBin\fP <object name>
-
- <class> \fB(\fP <parameters> \fB) ;\fP
-
- For a binary object, two fields are required:
-
- \fBrecord\fP <field name>\fB,\fP <field name> \fBin\fP <object name>
-
- <class> \fB(\fP <parameters> \fB) ; \fP
-
-
- In either case, <class> must match the name of a valid recorder class.
- The current set of <class> names and corresponding <parameters>
- are defined in Appendix A.
-
- Normally, every recorder object ought to have a unique <object name>,
- so that it can be read, cleared, and/or detached independently
- of other objects. However, a recorder object may be created
- with a null <object name>. Such an object can be destroyed (detached) or
- cleared, but not read.
-
- Note the semicolons following record statements; these are
- required.
-
- .IP (2)
- FILTER OBJECTS:
-
- To create a filter object to be invoked on a specified field, use
- a conditional statement. This takes the form of
- a \fIfilter clause\fP:
-
- \fBif\fP <field name> \fBis\fP <object name>
-
- <class> \fB(\fP <parameters> \fB)\fP
-
- followed by either:
-
- <TRUE invocation>
-
- or:
-
- <TRUE invocation> else <FALSE invocation>
- .sp
-
- Here, <class> must match the name of a valid filter class.
-
- <TRUE invocation> and <FALSE invocation> may themselves be recorder
- or filter invocations, or may be arbitrary sublists of invocations, grouped
- together inside braces \*Q{ }\*U. These sublists may themselves include filter
- invocations, and this nesting can go to any depth.
-
- The sense of the filter clause may be reversed by specifying "isnot"
- instead of "is".
- .LP
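-
- The following fragment is a sketch of a filter clause with an
- \*Qelse\*U branch. The object names \*Qfrom-local\*U and \*Qfrom-remote\*U
- are hypothetical, chosen only for illustration:
-
- .DS
- if IP.srcnet is eqf(128.9.0.0) {
-     record Ether.src in from-local freq-all;
- } else {
-     record Ether.src in from-remote freq-all;
- }
- .DE
-
- Packets whose source network is 128.9.0.0 would be counted in
- \*Qfrom-local\*U and all other IP packets in \*Qfrom-remote\*U.
- Writing \*Qisnot\*U in place of \*Qis\*U would exchange the two branches.
- .LP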
-
- The <parameters> string generally specifies a list of one or more values
- separated by commas. The number and meaning of these values depend
- upon the particular class (see Appendix A for details). The
- <parameters> string and the surrounding parentheses may be omitted if the
- particular class does not require parameters or if the specified object
- has been defined previously with parameters.
-
- .mc |
- If the specified <object name> already exists, a new invocation
- specification refers to the same object. In this case, <class> may
- be omitted, but if it is respecified it must agree with the class
- of the existing object. If <class> is respecified, then
- \*Q\ (\ <parameters>\ )\*U may also be respecified, but its values
- must agree with the parameters given for the existing object.
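-
- For illustration, and assuming the rules above, the second and third
- statements below refer to the existing object \*Qgwys\*U. The fourth
- would be expected to be rejected with the \*QClass Conflict\*U error
- described in the next section, because \fIfreq-all2\fP (see Appendix A)
- does not agree with the original class:
-
- .DS
- record Ether.src in gwys freq-all;
- record Ether.dst in gwys;
- record Ether.dst in gwys freq-all;
- record Ether.dst in gwys freq-all2;
- .DE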
-
- .NH 3
- Attach Error Messages
- .LP
-
- This section lists the error messages that may occur in processing an
- attach command.
-
- .IP *
- ATTACH error -- Bad field name: <field name>
-
- The specified string is not the name of any defined field. The valid
- field names can be obtained at any time using "show ?".
- .IP *
- ATTACH error -- Class Conflict for: <object name>
-
- Two invocations of the same object specify conflicting class names.
- .IP *
- ATTACH error -- Parm list conflict for: <object name>
-
- Two invocations of the same object specify conflicting parameter lists.
- .IP *
- ATTACH error -- Unknown class for new object: <object name>
-
- In the first invocation of an object, no class has been specified.
- .IP *
- ATTACH error -- Conflicting data type: <object name>
-
- The same object is being invoked on different fields that have different
- types and are therefore incompatible.
- .IP *
- ATTACH error -- Conflicting field size: <object name>
-
- The same object is being invoked on different fields that have different
- lengths and are therefore incompatible.
- .IP *
- ATTACH error -- Cannot start with <input text>
-
- Syntax error.
- .IP *
- ATTACH error -- Syntax error at <input text>
- .IP *
- ATTACH error -- No matching enum for <text>
-
- Unable to find matching enum string for symbolic parameter value.
- .IP *
- ATTACH error -- Unknown name: <string>
-
- Unknown host domain name used as a parameter value.
- .RE
-
- .mc
- .NH 3
- Enumerations
- .LP
-
- Some packet header fields (e.g., the IP protocol number)
- may be characterized as \*Qenumerations\*U,
- meaning that there is a discrete set of possible values.
- It is helpful to humans viewing the output of a \fIread\fP command to have
- appropriate mnemonic labels attached to the values of enumeration fields.
- The \fIenum\fP command may be used to define such mnemonic label strings.
-
- The \fIenum\fP command has only local effect.
- \fIEnum\fP commands taken from the
- .B statspy
- configuration file or entered
- locally on the
- .B statspy
- console control only the formatting of
- local read commands. Similarly, an \fIenum\fP
- command entered in
- .B rspy
- is used locally at
- .B rspy
- for formatting
- read results; it is not transmitted across the network to
- .B statspy .
-
- The \fIenum\fP command has the form:
-
- \fBenum {\fP <enum parameters> \fB}\fP
-
- The general form of <enum parameters> is:
-
- .DS
- <object spec> ( <value> <label>, ... , <value> <label> ),
-
- ...
-
- <object spec> ( <value> <label>, ... , <value> <label> )
-
- .DE
-
- That is, it generally specifies a list of object specs,
- and for each
- a sub-list of (value, label) pairs.
-
- Here is a possible \fIenum\fP command parameter that defines label strings
- for objects attached to the IP protocol field:
-
- .KS
- *IP.proto* (1 \*QICMP\*U, 3 \*QGGP\*U, 6 \*QTCP\*U, 8 \*QEGP\*U,
-
- 12 \*QPUP\*U, 17 \*QUDP\*U, 20 \*QHMP\*U, 22 \*QXNS-IDP\*U,
-
- 27 \*QRDP\*U, 77 \*QND\*U)
- .KE
-
- The string \*Q*IP.proto*\*U is an <object spec>, indicating that this
- list of labels will be used in formatting a read operation for any object
- whose name contains the embedded string \*QIP.proto\*U.
-
- Note that each sublist is keyed to an <object spec>, not an object name
- or a field name. When the results of reading a specific object are
- formatted for display, the <object name> of that object is matched
- against each <object spec> that has appeared in any \fIenum\fP command;
- the first match causes the corresponding set of (value, label) pairs to
- be used. Careful choice of object names is necessary to take advantage
- of this wildcard matching mechanism.
-
- The <label> elements may be surrounded with quotation marks (\*Q\*U), and
- must be if they contain embedded blanks or other special
- .mc |
- characters. A label surrounded with quotation marks may contain any
- printable characters except: comma, linefeed ("\\n"), or
- quotation marks themselves.
- .mc
-
- The effect of \fIenum\fP commands is cumulative. There is no command to
- delete an enumeration; however, new label definitions will override
- previous definitions for the same (object-spec, value) pairing.
- .LP
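-
- As a sketch of the complete command, the following two \fIenum\fP
- commands use the syntax given above. The object spec \*Q*TCP.port*\*U
- and all of the labels shown for it are hypothetical:
-
- .DS
- enum { *IP.proto* (6 \*QTCP\*U, 17 \*QUDP\*U),
-        *TCP.port* (23 \*QTELNET\*U, 25 \*QSMTP\*U) }
-
- enum { *IP.proto* (17 \*QUser Datagram\*U) }
- .DE
-
- Because the effect is cumulative, after both commands value 6 would
- still be labeled \*QTCP\*U, while value 17 would be labeled
- \*QUser Datagram\*U, the later definition overriding the earlier one.
- The last label must be quoted because it contains an embedded blank.
- .LP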
-
- .NH 2
- Remote Control of Statspy
- .LP
-
- The
- .B rspy
- program may be used to enter commands remotely to a running
- .B statspy
- program. The command to execute
- .B rspy
- is:
-
-
- .B1
-
- \fBrspy\fP [\fB\-p \fIport\fR] [\fB\-h \fIhost\fR] [\fIcommand-file\fP]
-
- .B2
-
-
- Here the parameters are:
-
- .IP \fB\-p\fP
- TCP port number on which
- .B statspy
- is listening.
- The default is 2222.
-
- .IP \fB\-h\fP
- The name or dotted-decimal IP address of the
- .B statspy
- host. The default is the local host.
-
- .IP \fIcommand-file\fP
- The optional name of a file containing a script of commands to be executed when
- .B rspy
- begins.
- .LP
-
- .B Rspy
- will then prompt for input (\*QRspy>\*U), accepting
- new commands from standard input and writing any output to standard
- output. Unix*
- .FS
- * UNIX is a trademark of AT&T Bell Laboratories.
- .FE
- pipes can be used for input commands and/or output results.
-
- The commands to
- .B rspy
- are those listed in Section 3.2 for
- .B statspy ,
- plus one additional command peculiar to
- .B rspy :
-
- .IP o
- \fIhost\fP <IP address>
-
- Overrides the -h parameter
- to specify the remote host to which following commands
- will be directed.
- Here <IP address> may be a host domain name or a dotted-decimal IP address.
- .LP
-
- Generally, commands entered to
- .B rspy
- are sent to
- .B statspy
- on the
- remote host. However, the \fI?\fP, \fIenum\fP, \fIhost\fP, and \fIquit\fP
- commands are executed locally by \fBrspy\fP.
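-
- For illustration, the following shell commands are a sketch of two
- ways to run \fBrspy\fP. The file names \*Qsetup.cmds\*U, \*Qcmds.txt\*U,
- and \*Qresults.txt\*U are hypothetical:
-
- .DS
- rspy -h 35.1.1.21 -p 2222 setup.cmds
- rspy -h 35.1.1.21 < cmds.txt > results.txt
- .DE
-
- The first form executes the script in \*Qsetup.cmds\*U and then prompts
- for further commands. The second form uses shell redirection to take
- commands from a file and to save the output in another file.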
-
- .bp
- .NH
- Data Collection
- .LP
-
- .NH 2
- Using Collect
- .LP
-
- .B Collect
- is executed as:
-
-
- .B1
- .nf
-
- \fBcollect\fP [\fB\-e \fIenumfile\fR] [\fB\-h \fIhost1\fR] ... [\fB\-h \fIhostn\fR] [\fB\-p \fIport\fR]
-
- [\fB\-i \fImin\fR] [\fB\-c \fImin\fR] [\fB\-r \fImin\fR] [\fB\-d\fP|\fB\-dl\fP|\fB\-dx\fP] \fIobject-spec\fP
-
- .fi
- .B2
-
- These parameters are:
-
- .nf
- .IP \fIobject-spec\fP
- The objects from which data is to be collected. \fIobject-spec\fP may
- contain the \*Qwild-card\*U character \*Q*\*U.
- \fIobject-spec\fP is a mandatory parameter, with no default.
-
- .IP \fB\-e\fP
- Name of a file containing <enum parameters>
- (see Section 3.3.4) for labeling results. Default is no enum file.
-
- .IP \fB-p\fP
- TCP port to connect to; default is 2222.
-
- .IP \fB-h\fP
- An SAA host from which data is being collected. A series of \fB-h\fP
- parameters may appear, to define a list of SAA hosts. Parameter may be a
- dotted decimal IP address or a domain name. Default is the local host.
-
- .IP \fB-i\fP
- Polling interval Ti in minutes. Default is 0, causing \fBcollect\fP to
- run once.
-
- .IP \fB-c\fP
- Checkpoint interval Tc in minutes. Default is 0, causing only the latest
- data to be saved.
-
- .IP \fB-r\fP
- Clear interval Tr in minutes. This
- is the interval at which \fBcollect\fP sends a \fIreadclear\fP command
- instead of a \fIread\fP command to the hosts. Default is zero, causing
- no clearing to take place.
-
- .IP \fB-d\fP
- Direct trace & log to stdout.
- .IP \fB-dl\fP
- Direct trace to stdout, log to files.
- .IP \fB-dx\fP
- Direct trace & hex dump to stdout.
-
- Default is no trace, log directed to files.
- .LP
-
- The time parameters used by the
- .B collect
- program must be entered in minutes; this unit was chosen
- both for convenience
- and to avoid giving the user a false sense of precision.
-
- An example of typical parameters that one might use to run
- .B collect
- from a \*QC shell\*U is as follows:
-
- collect '*' -h 35.1.1.21 -i 5 -c 60 -r 1440 >& errors.log &
- .fi
-
- In this example:
-
- .IP o
- The read interval Ti is 5 minutes, the checkpoint interval Tc is 60 minutes,
- and the clear interval Tr is 1440 minutes (24 hours).
-
- .IP o
- The '*' object-spec tells
- .B collect
- to read and save
- statistics from all objects.
-
- .IP o
- The -h 35.1.1.21 parameter specifies the host on which
- .B statspy
- is executing.
-
- .IP o
- The \*Q>& errors.log\*U redirects any error reports to the file
- \*Qerrors.log\*U.
- .IP o
- The final \*Q&\*U starts the collection of statistics
- in background mode, so that it is unaffected by other use of
- the shell and logouts. It can be stopped via use of the
- \*Qkill\*U command.
- .LP
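-
- A second sketch polls two hosts and labels the results from an enum
- file. The second host address and the file name \*Qenums.txt\*U are
- hypothetical:
-
- .DS
- collect -e enums.txt -h 35.1.1.21 -h 35.1.1.22 -i 10 -c 60 '*IP*' >& errors.log &
- .DE
-
- The object-spec is quoted so that the shell does not expand the \*Q*\*U
- characters as a file name pattern. Since no -r parameter is given,
- \fBcollect\fP never clears the objects.
- .LP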
-
- .NH 2
- Collect Log Files
- .LP
-
- .B Collect
- saves statistics in files whose names are formed from the
- .B statspy
- host name, the object name, and the time that
- .B collect
- is started.
- For example, the statistics file named:
-
- \*Q35.1.1.21-gwys.1214.1540\*U
-
- was
- created at 1540 (local time for \fBcollect\fP) on December 14th, and contains
- statistics from the object named \*Qgwys\*U on
- .B statspy
- host \*Q35.1.1.21\*U.
-
- Figure 3 shows an example of an individual
- .B collect
- log entry, which has the same format as a local display of the read data.
- Each log entry contains three timestamps:
- .IP o
- \*QCreationTime\*U is the time that the
- .B statspy
- module was started.
- .IP o
- \*QReadTime\*U is the time associated with the data in the current log entry.
- .IP o
- \*QClearTime\*U is the last time that this object was cleared. If the
- object has never been explicitly cleared, the ClearTime is the same as
- the CreationTime.
- .LP
-
- The
- .B statspy
- module sends these timestamps
- in universal (UNIX) time, and
- .B collect
- converts them to its
- local time when they are formatted and written.
-
- Not all of the data that
- .B collect
- receives from
- .B statspy
- are saved permanently in log files. Data must be saved in two
- situations:
-
- .IP (1)
- The time between
- checkpoints has elapsed.
-
- The \*Qcheckpoint\*U
- parameter is typically used to provide a statistical breakdown of traffic
- by time of day, e.g.,
- the number of packets received during each hour of the day.
-
- .IP (2)
- The
- .B statspy
- object has
- been cleared.
-
- The
- .B statspy
- object may have been cleared intentionally by a clear command or
- unintentionally by a crash and restart of the SAA machine. The
- .B collect
- program polls each
- .B statspy
- every Ti minutes, which should be short enough to minimize the
- statistical loss due to SAA crashes.
- .LP
-
- It is also possible that the SCH running
- .B collect
- will itself crash. Therefore,
- .B collect
- always writes new data into the log file, but it may either overwrite the
- previous log entry or append to the end of the file, saving the
- previous entry.
-
- In order to decide whether a particular entry should be
- saved,
- .B collect
- keeps track of the
- last \*QClearTime\*U received and the next time that a checkpoint
- log entry should be saved.
- Specifically,
- .B collect
- will overwrite the previous entry unless:
- .IP a.
- The previous entry is the very first entry appearing in
- the log file, or
- .IP b.
- The previous entry was received at or after the time that
- a checkpoint was required, or
- .IP c.
- The ClearTime on the current entry is different from
- the ClearTime on the previous entry.
- .LP
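-
- As a worked sketch of rules a through c, suppose Ti is 5 minutes, Tc is
- 60 minutes, \fBcollect\fP starts at 09:00, and the object is cleared
- once, between the 10:10 and 10:15 polls. The sketch assumes that the
- first checkpoint falls due at 10:00; the exact alignment of checkpoints
- is not specified here.
-
- .DS
- Poll     Action on the log file
- 09:00    first entry written
- 09:05    appended: previous entry (09:00) is the first in the file (rule a)
- 09:10    overwrites the 09:05 entry
-  ...     each poll overwrites the entry before it
- 10:00    overwrites the 09:55 entry
- 10:05    appended: previous entry (10:00) was received at checkpoint time (rule b)
- 10:10    overwrites the 10:05 entry
- 10:15    appended: ClearTime differs from that of the 10:10 entry (rule c)
- .DE
-
- Under these assumptions the log file would then hold four entries:
- 09:00, 10:00, 10:10 (the last reading before the clear), and 10:15.
- .LP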
-
- .NH 2
- Data Reduction Programs
- .LP
-
- The NNStat package includes several programs to process log files
- produced by the
- .B collect
- program.
- The
- .B lookupnames
- program scans through log files, outputting the
- original text of the log file with the appropriate domain name added
- following each instance of a host number or a network number.
- Shell scripts that invoke AWK programs have also been included.
- These scripts may be installed as command aliases named
- .B count-totals
- and
- .B bin-totals .
- These commands are intended to provide
- part of the data reduction capability needed to produce traffic statistics.
-
- .NH 3
- Lookupnames Program
- .LP
-
- The
- .B lookupnames
- program may be used to scan log files for embedded host and
- network numbers, map these numbers into corresponding names, and create a
- new file with the names inserted immediately after the corresponding numbers.
-
- The following is an example of output produced by
- .B lookupnames :
-
- .KS
- [35.0.0.0 MERIT:35.0.0.0 MERIT]= 112665 (93.9%) @-0sec
- [35.0.0.0 MERIT:128.116.0.0 USAN]= 387 ( 0.3%) @-36sec
- [128.116.0.0 USAN:35.0.0.0 MERIT]= 462 ( 0.4%) @-35sec
- [35.0.0.0 MERIT:128.182.0.0 PSCNET]= 388 ( 0.3%) @-46sec
- [128.182.0.0 PSCNET:35.0.0.0 MERIT]= 345 ( 0.3%) @-45sec
- .KE
-
-
- The number-to-name conversion is performed using the Domain Name
- System or, if
- no matching entry is returned, using a local file of network names. A standard
- hosts.txt file may also be used to supply the network names.
- If no host/network name is found in either database, the string
- \*Q(UNKNOWN-HOST)\*U is displayed in place of the name.
-
- The usage is:
-
- .B1
-
- \fBlookupnames\fP [\fB\-n \fIfilename\fR] [\fB\-t \fIseconds\fR] \fIinput-file-list\fP
-
- .B2
-
- The input files, concatenated and augmented with the name strings, are
- written to standard output.
-
- The optional command line flags are as follows:
-
- .IP \fB\-n\fP
- The name of a networks file. If this parameter is unspecified, the
- program looks for a file named:
- \*Qnetworks.txt\*U.
- This file should contain the networks
- portion of a standard \*Qhosts.txt\*U file, for example:
-
- NET : 128.9.0.0 : ISI-NET :
-
- Alternatively, the full hosts.txt may be used. The
- .B lookupnames
- program scans the networks list for \*QNET\*U entries, until the
- beginning of the first \*QGATEWAY\*U entry or the end of the file.
-
- .IP \fB\-t\fP
- The timeout for host name lookups; the default is 5 seconds.
- If this timeout expires, the
- .B lookupnames
- program checks the
- networks-file for a matching entry. If none is found, the string
- \*Q(TIMEOUT)\*U is printed in place of the host/network name.
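- .LP
-
- For illustration, the following sketch runs \fBlookupnames\fP with a
- 10-second timeout. The log file name \*Q35.1.1.21-IP.nets.1214.1540\*U
- and the output file name \*QIP.nets.names\*U are hypothetical:
-
- .DS
- lookupnames -n networks.txt -t 10 35.1.1.21-IP.nets.1214.1540 > IP.nets.names
- .DE
-
- The redirection is needed to keep the result, since the augmented text
- is written to standard output.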
-
- .NH 3
- Count-totals
- .LP
-
- The command
- .B count-totals
- can be used to summarize total packet counts logged by the
- .B collect
- program.
- Taking into account any
- .B statspy
- restarts or
- .B statspy
- totals that were
- cleared, it computes both daily totals and a grand total from a given log
- file. It may be invoked on a list of log files, in which case it
- summarizes each file independently.
-
- The command used to invoke
- .B count-totals
- is:
-
- \fBcount-totals [v=1] \fIlogfile\fR ...
-
- The required \fIlogfile\fP parameter is a list of one or more log files
- produced by
- .B collect
- (and no others).
- The list may contain the usual wildcard specification(s).
- Output is written to a results file named \*Qcount-totals.out\*U,
- as well as to the standard output.
-
- An example is:
-
- count-totals v=1 *IP*
-
- The optional \fBv=1\fP parameter is used to signify that the results should
- be \*Qverbose\*U, which in this case means that a line summarizing each update
- appears in the results file. If the \fBv=1\fP parameter is omitted, only
- daily totals and a grand total are included.
-
- The following is an example of the format of a verbose results file:
-
- .KS
- .nf
- .B1
-
- File: 35.1.1.21-IP.lens.1221.1523
-
- Log created on Tue Dec 22 08:00:25 1987, for host 35.1.1.21.
- Sample interval = 60 min; checkpoint interval = 60 min.
- Object name = 'IP.lens'.
- .ta 1.3i 2.3i 3.3i
-
- Read-Time Clear-Time Total-Count Increment
- --------- ---------- ----------- ---------
- .ta 2.3iR 2.8iR 3.8iR
-
- 08:01:10 12-22 07:06:22 12-22 19262 0
- 09:00:55 12-22 52297 33035
- 10:01:04 12-22 81393 29096
- 11:00:54 12-22 119954 38561
- ... (etc)
- 23:00:54 12-22 468588 12147
- 00:00:55 12-23 493492 24904
- Daily total = 474230 (08:01:10 12-22 to 00:00:55 12-23)
-
- 01:00:54 12-23 515588 22096
- 02:00:54 12-23 542095 26507
- 03:00:54 12-23 566430 24335
- 04:00:54 12-23 586511 20081
- ... (etc)
- 22:00:57 12-23 1043959 7317
- 23:00:55 12-23 1054753 10794
- 00:00:58 12-24 1070114 15361
- Daily total = 576622 (00:00:55 12-23 to 00:00:58 12-24)
- ... (etc)
- ... (etc)
- Daily total = 182408 (00:00:58 12-24 to 14:01:17 12-24)
-
- Grand Total = 1233260 (08:01:10 12-22 to 14:01:17 12-24)
-
- .B2
- .fi
- .KE
-
- The following is an example of the corresponding non-verbose
- format:
-
- .KS
- .nf
- .B1
-
- File: 35.1.1.21-IP.lens.1221.1523
-
- Log created on Tue Dec 22 08:00:25 1987, for host 35.1.1.21.
- Sample interval = 60 min; checkpoint interval = 60 min.
- Object name = 'IP.lens'.
-
- Daily total = 474230 (08:01:10 12-22 to 00:00:55 12-23)
- Daily total = 576622 (00:00:55 12-23 to 00:00:58 12-24)
- Daily total = 182408 (00:00:58 12-24 to 14:01:17 12-24)
-
- Grand Total = 1233260 (08:01:10 12-22 to 14:01:17 12-24)
-
- .B2
- .fi
- .KE
-
-
- In the verbose format, column headings have the following meanings:
-
- .IP o
- \*QRead-Time\*U contains the ReadTime
- returned from each
- .B statspy
- response to a query performed by
- .B collect .
-
- .IP o
- \*QClear-Time\*U is filled in for each new
- ClearTime found in the current log file being processed. A blank
- Clear-Time field signifies that the field is unchanged since the previous
- entry.
-
- .IP o
- \*QTotal-Count\*U corresponds to the \*QTotal Count\*U reported in each
- response.
-
- .IP o
- \*QIncrement\*U contains the number of packets counted
- between the current response and the previous response.
- .LP
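-
- The figures in the verbose example are consistent with simple
- differences of successive Total-Count values, as the following check
- shows. It applies only to this example, in which the Clear-Time never
- changes; the adjustment made after a clear or restart is not shown.
-
- .DS
- Increment at 09:00:55  =   52297 -  19262 =  33035
- Daily total for 12-22  =  493492 -  19262 = 474230
- Daily total for 12-23  = 1070114 - 493492 = 576622
- Grand Total            = 474230 + 576622 + 182408 = 1233260
- .DE
- .LP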
-
- .NH 3
- Bin-totals
- .LP
-
- The command
- .B bin-totals
- produces a summary showing the total number of packets counted in each of the
- corresponding bins appearing in a log file, taking into account any
- .B statspy
- restarts or object clears.
- It can be invoked on a list of log files
- to summarize each independently.
-
- The command to invoke
- .B bin-totals
- is:
-
- \fBbin-totals \fIlogfile\fR ...
-
- As before, the parameter is a list of name(s) of one or more log files produced
- by the
- .B collect
- program, or a wildcard file specification that matches
- (only) files produced by the
- .B collect
- program.
- Output is written to a results file named \*Qbin-totals.out\*U,
- as well as to the standard output.
-
- The following is an example of the output from the
- .B bin-totals
- command:
-
- .KS
- .nf
- .B1
-
- File: 35.1.1.21-IP.lens.1221.1523
-
- Log created on Tue Dec 22 08:00:25 1987, for host 35.1.1.21.
- Sample interval = 60 min; checkpoint interval = 60 min.
- Object name = 'IP.lens'.
-
- Summary Period: 08:01:10 12-22 to 14:01:17 12-24.
-
- [0-9] total = 0
- [10-19] total = 0
- [20-39] total = 1112
- [40-79] total = 557177
- [80-159] total = 149395
- [160-319] total = 194274
- [320-639] total = 61625
- [640-1279] total = 192769
- [1280-2559] total = 76908
- [2560-5119] total = 0
-
- .B2
- .fi
- .KE
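-
- As a consistency check, this example and the \fBcount-totals\fP example
- summarize the same log file over the same period, and the bin totals
- above add up to the Grand Total reported there:
-
- .DS
- 1112 + 557177 + 149395 + 194274 + 61625 + 192769 + 76908 = 1233260
- .DE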
-
- .bp
- .SH
- Appendix A -- Catalog of Objects
- .LP
-
- This Appendix describes the statistical object classes currently implemented
- in \fBstatspy\fP.
-
- .SH 2
- Recorder Objects
- .LP
-
- A recorder object is invoked at its Write() entry point to record values
- of a specific field or set of fields.
-
- .IP 1.
- Frequency of all Values
-
- \fBfreq-all\fP
-
- An object of class \fIfreq-all\fP (abbreviated as \fIFA\fP) builds a
- frequency distribution table for a single field. This table is built
- dynamically, with a bin for every distinct value that occurs.
-
- Each time a bin is added or incremented, the current time (in seconds
- since Jan 1, 1970) is recorded in the bin. A read operation returns this
- last-update time with the value and count for each bin. We expect that
- these times will be useful in analysis of the data; for example, it
- would be possible to extract only recently occurring values.
-
- The list of bins returned by a read operation is sorted into order
- of decreasing counts, and within the same count, by last update time.
-
- The implementation of the \fIfreq-all\fP class uses a chained hash
- scheme, dynamically allocating memory for bins in \*Qpages\*U of 2K
- bytes. There is a built-in limit of 1024 bins. In addition to the hash
- chain, each bin is chained into a doubly-linked sorted list that is
- used to order the read sequence. Sorting
- bins into this list is accomplished using an incremental algorithm whose
- CPU time is linear in the total count (and is in fact negligible).
-
- .bp
- .IP 2.
- Frequency of selected set of values
-
- \fBfreq-only ( <value>, ... <value> ) \fP
-
- An object of class \fIfreq-only\fP (abbreviated as \fIFO\fP) builds a
- frequency distribution table for only those values that are included in
- the parameter list; values not in the list are counted in a single
- \*QOther\*U bin. A read operation on the object displays this
- frequency table and the \*QOther\*U count. If the given set of values is
- empty, the \*QOther\*U count equals the total number of invocations.
-
- The <value> entries may be expressed in a variety of ways.
-
- .RS
- .IP o
- Decimal integer, limited to 2**31 maximum.
-
- .IP o
- Hex integer, using the \*QC\*U notation 0x....
-
- .IP o
- IP address, specified as either a dotted-decimal
- number or as a domain name.
-
- .IP o
- Ethernet Address, specified in \*Qcoloned-hex\*U format:
- xx:xx:xx:xx:xx:xx, where each x represents a hex digit.
-
- .IP o
- A quoted enumeration label: \*Q<label>\*U. This implies
- the corresponding value, and
- provides a way to define field values symbolically. See
- Appendix C for examples.
- .RE
- .LP
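-
- The following statements sketch several of these value forms. The
- object names are hypothetical, and the statements assume that each
- field accepts the form of value shown:
-
- .DS
- record TCP.srcport in tcp-ports freq-only(23, 0x19);
- record IP.srcnet in src-nets freq-only(128.9.0.0, 35.0.0.0);
- record Ether.src in gw-ether freq-only(02:07:01:00:08:30, 08:00:02:00:49:30);
- .DE
-
- In the first statement, 23 is a decimal integer and 0x19 is the hex
- form of decimal 25. Any other value of the field would be counted in the
- \*QOther\*U bin of the object concerned.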
-
- .bp
- .IP 3.
- Frequency of all Value Pairs
-
- \fBmatrix-all\fP
-
- An object of class \fImatrix-all\fP (abbreviated as \fIMA\fP) builds a
- table of frequencies of all pairs of values in two fields. This table is
- built dynamically, with a bin for every distinct value pair that occurs.
- The pair of values (a,b) is counted separately from the pair (b,a).
- The list of bins returned by a read operation is sorted into
- order of decreasing counts, and within the same count, by last update time.
-
- The internal structure and implementation of this class are the same as
- those of the \fIfreq-all\fP class, described above.
-
- If the object name matches an enumeration, the corresponding labels
- are used for the first value of each pair.
-
- Note: if a \fImatrix-all\fP object is defined with a non-zero parameter, it
- operates as a \fImatrix-sym\fP object (see following).
-
- .IP 4.
- Symmetric Frequency of Value Pairs
-
- \fBmatrix-sym\fP
-
- An object of class \fImatrix-sym\fP (abbreviated as \fIMS\fP) builds a
- table of frequencies of all pairs of values in two fields. This table is
- built dynamically, with a bin for every distinct value pair that occurs.
- The list of bins returned by a read operation is sorted in order
- of decreasing counts, and within the same count, by last update time.
-
- If the two argument fields have the same length, they are
- treated \*Qsymmetrically\*U: (b,a) and (a,b) are counted in the same bin.
- If the lengths differ, \fImatrix-sym\fP operates like \fImatrix-all\fP.
-
- The internal structure and implementation of this class are the same as
- those of the \fImatrix-all\fP class, described above.
-
- If the object name matches an enumeration, the corresponding labels
- are used for the first value of each pair.
- .bp
- .IP 5.
- Histogram
-
- \fBhist ( <scale factor> [, <max bin>] )\fP
-
- An object of class \fIhist\fP (abbreviated as \fIHI\fP)
- builds a linear histogram of (unsigned)
- integer field values. Each bin of the histogram has the same size,
- given by the value <scale factor>.
- The optional second parameter specifies the ordinal number of the
- maximum bin that is collected; if it is omitted, 1024 is used.
-
- If <scale factor> is S and
- <max bin> is M, a Read operation on the object defined by hist(S, M)
- returns the counts:
-
- .KS
- Bin 0: Count( 0 <= X < S )
- ...
- Bin j: Count( j*S <= X < (j+1)*S )
- ...
- Bin M: Count( M*S <= X < (M+1)*S )
- .KE
-
- plus a count of values that were off-scale, i.e., >= (M+1)*S.
- Here \*QCount(\ f(X)\ )\*U means the number of invocations with
- value X for which f(X) was true.
-
- The Read operation also reports the average, maximum, and minimum
- values observed.
- A \fIhist\fP object is restricted to invocation on a field of 4 bytes or less.
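-
- As a worked sketch, assume a hypothetical object \*Qport-hist\*U
- created by the statement:
-
- .DS
- record TCP.srcport in port-hist hist(256, 3);
- .DE
-
- With S = 256 and M = 3, the four bins cover 0-255, 256-511, 512-767, and
- 768-1023. A source port of 23 is counted in Bin 0, a port of 513 in
- Bin 2 (since 2*256 <= 513 < 3*256), and a port of 2049 is counted as
- off-scale (since 2049 >= 4*256 = 1024).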
-
- .IP 6.
- Logarithmic Histogram
-
- \fBhist-pwr2 ( <scale factor> )\fP
-
- An object of class \fIhist-pwr2\fP (abbreviated as \fIP2\fP)
- builds a logarithmic histogram, i.e., one with intervals
- increasing as powers of 2. Specifically, a Read()
- operation on a \fIhist-pwr2\fP object returns the following counts:
-
- .KS
- Bin 0: Count(X < S)
-
- Bin 1: Count(S <= X < 2*S)
- ...
- Bin j: Count(S*(2**(j-1)) <= X < S*(2**j) ), for j >= 1
- .KE
-
- where S is the value of the unsigned integer <scale factor>.
-
- A \fIhist-pwr2\fP object also reports the average, maximum, and minimum values
- observed.
- A \fIhist-pwr2\fP object is restricted to invocation on a field of 4 bytes or less.
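-
- The bin index for a power-of-2 histogram can be sketched in Python as
- follows, taking bin 0 to hold X < S and bin 1 to hold S <= X < 2*S, so
- that each later bin j holds S*2**(j-1) <= X < S*2**j. The function name
- pwr2_bin is hypothetical; this is a model of the rule, not the NNStat code.
-

```python
def pwr2_bin(x: int, scale: int) -> int:
    """Return the logarithmic bin index of x for scale factor S (sketch)."""
    if x < scale:
        return 0
    # x // scale lies in [2**(j-1), 2**j) exactly when its bit length is j.
    return (x // scale).bit_length()


# Bin indices for a few sample values with S = 8.
samples = {x: pwr2_bin(x, 8) for x in (0, 7, 8, 15, 16, 31, 32)}
```

-
- Each doubling of the value moves it up by one bin, so 32 bins are enough
- to cover any 4-byte field whatever the scale factor.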
-
- .IP 7.
- Measure Temporal Locality of Reference
-
- \fBworking-set\fP
-
- An object of class \fIworking-set\fP (abbreviated as \fIWS\fP) measures the degree of
- temporal clustering of values of a given field. This clustering is
- known as \*Qlocality of reference\*U, and in the memory domain leads to the
- concept of a \fIworking set\fP.
-
- Suppose we keep the set of the n
- distinct values that have occurred most recently in
- the field. This set will change over time, as new values that were
- not in the set replace the oldest (\*Qleast-recently used\*U) values in the set.
- Let C(n) be the number of new values in the observed sequence that
- replace values already in the set. If there have been a total of N packets
- in the sequence, C(n)/N estimates the probability of a new value falling outside
- the set.
-
- The \fIworking-set\fP object measures the values of C(1), C(2), C(4), C(8),...
- C(4096).
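-
- The definition of C(n) can be modeled in Python with a least-recently-used
- set, as sketched below. The function name working_set_misses is
- hypothetical, and the sketch makes one assumption the text leaves open:
- values seen while the set is still filling up are counted as falling
- outside the set. It also runs one pass per set size for clarity, which is
- not necessarily how NNStat computes the counts.
-

```python
from collections import OrderedDict


def working_set_misses(values, n):
    """Count values that were not among the n most recently used values."""
    recent = OrderedDict()  # keys ordered from oldest to most recently used
    misses = 0
    for v in values:
        if v in recent:
            recent.move_to_end(v)           # refresh: v is now the newest
        else:
            misses += 1                     # v fell outside the set
            recent[v] = None
            if len(recent) > n:
                recent.popitem(last=False)  # evict the least-recently used
    return misses


sequence = ["a", "b", "a", "c", "b", "a"]
c = {n: working_set_misses(sequence, n) for n in (1, 2, 4)}
```

-
- For this six-value sequence, C(1)/N = 6/6, C(2)/N = 5/6, and C(4)/N = 3/6:
- the larger the set, the less often a value falls outside it.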
-
- .IP 8.
- Record Sequence of Values in Binary
-
- \fBbin-pkt ( <max> )\fP
-
- An object of class \fIbin-pkt\fP (abbreviated as \fIBP\fP)
- builds a circular buffer of <max> entries, containing the
- most recent values in a specified field.
- A read operation on the object displays all
- values in this buffer, oldest first.
-
- Although this object may be invoked from any field, it is really intended
- for recording the complete headers (up to 63 bytes) of packets.
- For this purpose, the virtual field \*Qpacket\*U is defined
- (see Section 3.3.1).
- Normally, a \fIbin-pkt\fP object invocation will be under control of
- filter object(s) to record selected packets for diagnostic purposes.
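-
- The circular-buffer behavior can be sketched in Python as follows. The
- class name BinPkt and its method names are hypothetical; the sketch only
- models the \*Qkeep the most recent <max> values, read oldest first\*U rule.
-

```python
from collections import deque


class BinPkt:
    """Illustrative model of a bin-pkt object; not the NNStat source."""

    def __init__(self, max_entries):
        # A bounded deque silently drops the oldest entry when it is full.
        self.buf = deque(maxlen=max_entries)

    def record(self, value: bytes):
        self.buf.append(value)

    def read(self):
        # Entries come back oldest first.
        return list(self.buf)


pkts = BinPkt(3)
for header in (b"p1", b"p2", b"p3", b"p4", b"p5"):
    pkts.record(header)
snapshot = pkts.read()
```

-
- After five invocations on a three-entry buffer, only the last three
- values remain.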
-
- .IP 9.
- Variant of \fIfreq-all\fP
-
- \fBfreq-all2\fP
-
- An object of class \fIfreq-all2\fP (abbreviated as \fIFA2\fP) performs
- the same function as an object of the \fIfreq-all\fP class, except a
- \fIfreq-all2\fP object does not sort the list of bins, but rather
- displays bins in the order of their first occurrence. As a result,
- \fIfreq-all2\fP objects may use slightly less CPU time (although the
- difference appears to be negligible) and always use less memory for bins
- (16-20 bytes per bin, compared to 24-28 bytes for \fIfreq-all\fP objects).
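-
- The first-occurrence ordering can be sketched in Python as follows. The
- function name freq_all2 is reused here only as a label for the example;
- the sketch models the reporting order, not the NNStat data structure or
- its memory layout.
-

```python
def freq_all2(values):
    """Count occurrences, reporting bins in order of first occurrence."""
    bins = {}  # a dict preserves insertion order (Python 3.7 and later)
    for v in values:
        bins[v] = bins.get(v, 0) + 1
    return list(bins.items())


report = freq_all2(["ftp", "telnet", "ftp", "smtp", "telnet", "ftp"])
```
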
-
- .IP 10.
- Variant of \fImatrix-all\fP
-
- \fBmatrix-all2\fP
-
- An object of class \fImatrix-all2\fP (abbreviated as \fIMA2\fP) performs
- the same function as an object of the \fImatrix-all\fP class, except a
- \fImatrix-all2\fP object does not sort the list of bins, but rather
- displays bins in the order of their first occurrence. As a result,
- \fImatrix-all2\fP objects may use slightly less CPU time (although the
- difference appears to be negligible) and always use less memory for bins
- (16-24 bytes per bin, compared to 24-32 bytes for \fImatrix-all\fP
- objects).
-
- .bp
- .SH 2
- Filter Objects
- .LP
-
- A filter object tests given field values against some criterion and
- returns a Boolean value; the interpreter uses this result to select one
- of two alternative sequences of invocations.
-
- A filter object generally has a read-only data structure, but it does
- keep two statistical counters: the total number of invocations, and
- the number that resulted in a TRUE result. These two numbers are
- returned by a Read operation.
-
- .IP 1.
- Filter on range of values
-
- \fBrangef ( <Lower>, <Upper> )\fP
-
- A \fIrangef\fP object (abbreviated as \fIRF\fP)
- returns TRUE if the given value X falls inside the
- specified range:
-
- L <= X <= U,
-
- otherwise it returns FALSE.
- Here L and U are the unsigned integer values corresponding to <Lower>
- and <Upper>, respectively.
-
- .mc |
- .IP 2.
- Filter on equality
-
- \fBeqf ( <value> )\fP
-
- An \fIeqf\fP object (abbreviated as \fIEQ\fP)
- returns TRUE if the given field value matches the
- specified parameter value, otherwise it returns FALSE.
- Here <value> may take any of the forms described earlier for
- the \fIfreq-only\fP class.
-
- .mc
- .IP 3.
- Filter on selected set of values
-
- \fBsetf ( <value>, ... <value> )\fP
-
- A \fIsetf\fP object (abbreviated as \fISF\fP)
- returns TRUE if the given field value matches one of the
- values in the parameter list, otherwise it returns FALSE.
- Each <value> may take any of the forms described earlier for
- the \fIfreq-only\fP class.
-
- .mc |
- Note that setf with a single value is equivalent to eqf, but is
- much less efficient.
- .LP
-
- .mc
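-
- The three filter classes and their two counters can be modeled in Python
- as sketched below. The class name Filter and its attribute names are
- hypothetical, and the values are plain integers for simplicity; the
- frozenset stands in for whatever lookup structure setf uses internally.
-

```python
class Filter:
    """Illustrative filter object: a predicate plus two counters."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.invocations = 0  # total number of invocations
        self.true_count = 0   # invocations that returned TRUE

    def __call__(self, x):
        self.invocations += 1
        result = bool(self.predicate(x))
        if result:
            self.true_count += 1
        return result

    def read(self):
        # A Read operation returns both counters.
        return (self.invocations, self.true_count)


def rangef(lower, upper):
    return Filter(lambda x: lower <= x <= upper)  # inclusive at both ends


def eqf(value):
    return Filter(lambda x: x == value)


def setf(*values):
    members = frozenset(values)  # hashed membership test
    return Filter(lambda x: x in members)


telnet_ports = setf(23, 43, 79, 513)
results = [telnet_ports(port) for port in (23, 80, 513)]
low_ports = rangef(0, 1023)
edge = (low_ports(1023), low_ports(1024))
```

-
- In this model setf(v) and eqf(v) return the same results for every input;
- they differ only in the cost of the test.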
- .bp
- .SH
- Appendix B -- Syntax of Attach Command
- .LP
-
- This Appendix contains a BNF specification of the \fIattach\fP command syntax.
-
- The syntax of Attach parameters has been (deliberately) designed to parallel
- the syntax of \*QC\*U statements (that correspond to simple invocations) and
- statement-lists (that correspond to lists of invocations).
-
-
- <Attach command> ::= attach { <S-list> }
-
-
- <S-list> ::= <Statement> | <S-list> <Statement>
-
-
- <Statement> ::= record <record-invoke> ; |
-
- if <if-invoke> <Statement> else <Statement> |
-
- if <if-invoke> <Statement> |
-
- { <S-list> } | ;
-
-
- <record-invoke> ::= <field name> in <object defn> |
-
- <field name>, <field name> in <object defn>
-
-
- <if-invoke> ::= <field name> is <object defn> |
-
- <field name> isnot <object defn>
-
-
- <object defn> ::= <object name> <class> <class parm> |
-
- <class> <class parm> |
-
- <object name>
-
-
- <class parm> ::= <empty> | ( <value list> )
-
- <value list> ::= <value> | <value list> , <value>
-
- <value> ::= <decimal integer> |
-
- 0x<hex number> | 0X<hex number> |
-
- <IP address> |
-
- <Ethernet address> |
-
- \*Q<label>\*U
-
- <IP address> ::=
-
- <dotted-decimal number> |
-
- <host domain name>
-
- <Ethernet address> ::=
-
- <hex digit>:<hex digit>:<hex digit>:
- <hex digit>:<hex digit>:<hex digit>
-
- <hex number> ::= <hex digit> | <hex number><hex digit>
-
- <hex digit> ::= 00 | 01 | ... | fe | ff
-
- <field name> ::= <identifier>
-
- <object name> ::= <identifier>
-
- <identifier> ::= a letter, followed by: any string of
- letters, digits, or any of the
- special characters +-&._
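-
- The alternatives for <value> can be told apart by their surface form, as
- the following Python sketch suggests. It is not the statspy parser: the
- function name classify_value is hypothetical, plain double quotes stand
- in for the quoted label form, dotted-decimal is assumed to mean four
- decimal parts, upper-case hex letters are accepted, and any token that
- matches nothing else is assumed to be a host domain name.
-

```python
import re

# Checked in order; the first pattern that matches the whole token wins.
_PATTERNS = [
    ("ethernet", re.compile(r"[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}")),
    ("hex", re.compile(r"0[xX][0-9a-fA-F]+")),
    ("dotted-decimal", re.compile(r"[0-9]+(\.[0-9]+){3}")),
    ("decimal", re.compile(r"[0-9]+")),
    ("label", re.compile(r'"[^"]*"')),
]


def classify_value(token: str) -> str:
    """Return the name of the <value> alternative that token looks like."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(token):
            return name
    return "host-name"  # assumption: anything else is a domain name


kinds = [classify_value(t) for t in
         ("08:00:2b:03:4a:e7", "0x1F", "513", "128.1.0.0",
          '"FTP data"', "gw.example.edu")]
```
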
-
- .bp
- .SH
- Appendix C -- Building Configuration Files
- .LP
-
- This Appendix provides some guidelines, suggestions, and examples for building
- .B statspy
- configurations.
-
- .IP "Example A."
-
- Traffic flowing to and from a particular gateway can be selected
- with a filter:
-
- attach {
- if Ether.dst is eqf(08:00:2b:03:4a:e7) {
- ####going to gateway
- record ... ;
- record ... ;
- ...
- }
- if Ether.src is eqf(08:00:2b:03:4a:e7) {
- ####coming from gateway
- record ... ;
- record ... ;
- ...
- }
- }
-
- where 08:00:2b:03:4a:e7 is the Ethernet address of the gateway.
- (Note the convention for Ethernet addresses: \*Qcoloned-hex\*U).
-
- .IP "Example B."
-
- Another filtering approach is to select packets by classes of IP
- addresses. For example, suppose it is known that the list 128.1.0.0,
- 128.2.0.0, and 128.3.0.0 includes all local networks. In that case, the
- following selects \*Qtransit\*U packets, i.e., packets whose source and
- destination are both outside the local administrative area:
-
- if IP.srcnet isnot
- setf(128.1.0.0, 128.2.0.0, 128.3.0.0)
- if IP.dstnet isnot
- setf(128.1.0.0, 128.2.0.0, 128.3.0.0) {
- record ... ;
- record ... ;
- ...
- }
-
- .B Statspy
- has been designed to be efficient even if the list of values
- used as parameters to \fIsetf\fP() is very large (say, several hundred
- values) -- a \fIsetf\fP object uses a hash-table.
-
- .IP "Example C."
-
- Suppose the problem is to collect data on all TCP traffic destined
- for a particular gateway, broken down by source and destination IP addresses
- as well as packet type (Telnet, FTP, etc.).
-
- Here, \*Qpacket type\*U is a little bit hazy, but it is related to the
- occurrence of a well-known port number in the TCP source or destination
- port.
-
- In principle, there is no reason why
- .B statspy
- could not provide for three-way distributions, but it does not. One of
- the main reasons (besides distaste for the resulting messiness in the
- specifications and code) for not implementing three-way distributions is
- disbelief that administrators will want all that data! Running
- .B statspy
- for 12 hours on a typical large Ethernet has found packets to/from 100
- networks and 500 different IP hosts. The complete three-way matrix asked
- for here may therefore contain 10**6 bins. It seems unlikely that anyone
- will have use for a million numbers, accumulated over days, weeks, and
- months!
-
- In fact, it seems doubtful that even the hardiest administrator will
- really want to keep complete statistics by (IP source, IP destination)
- host pairs; after a few months of looking at 10**5 numbers, he/she will
- tire of it and begin to collect data only on source and destination
- network, or only for specific subsets of networks.
-
- The
- .B statspy
- design includes a number of features to contain the
- amount of data it can generate, for example the inclusion of IP network
- numbers distinct from host numbers, the conditional (filter)
- mechanism, and the \fIsetf\fP() filter described above.
-
- The recommended approach to using NNStat is as follows: set up some simple,
- general overall statistical measures, producing a volume of data that
- can reasonably be scanned. If some apparent anomalies are observed --
- e.g., a
- particular network seems to be producing more packets than expected --
- then augment the configuration for 24 hours with specific
- objects to analyze exactly those anomalous data.
-
- In any case, the following command sets up a configuration to
- provide counts broken down by source address, destination address,
- and packet type.
-
- .KS
- .nf
- attach {
-
- if TCP.dstport is setf(23, 43, 79, 513)
- record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
- else if TCP.srcport is setf(23, 43, 79, 513)
- record IP.srchost, IP.dsthost in Telnet.hosts;
-
- else if TCP.dstport is setf(20, 21, 69)
- record IP.srchost, IP.dsthost in ftp.hosts matrix-sym;
- else if TCP.srcport is setf(20, 21, 69)
- record IP.srchost, IP.dsthost in ftp.hosts;
-
- else if TCP.dstport is setf(25, 103, 104, 119)
- record IP.srchost, IP.dsthost in mail.hosts matrix-sym;
- else if TCP.srcport is setf(25, 103, 104, 119)
- record IP.srchost, IP.dsthost in mail.hosts;
- }
- .KE
- .fi
-
- .mc |
- We can avoid replicating the parameter lists to the setf objects by
- naming the first occurrence of each case and referencing the same
- object in later occurrences, as shown by the following:
-
- .KS
- .nf
- attach {
-
- if TCP.dstport is port.telnet setf(23, 43, 79, 513)
- record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
- else if TCP.srcport is port.telnet
- record IP.srchost, IP.dsthost in Telnet.hosts;
-
- else if TCP.dstport is port.ftp setf(20, 21, 69)
- record IP.srchost, IP.dsthost in ftp.hosts matrix-sym;
- else if TCP.srcport is port.ftp
- record IP.srchost, IP.dsthost in ftp.hosts;
-
- else if TCP.dstport is port.mail setf(25, 103, 104, 119)
- record IP.srchost, IP.dsthost in mail.hosts matrix-sym;
- else if TCP.srcport is port.mail
- record IP.srchost, IP.dsthost in mail.hosts;
- }
- .KE
- .fi
-
- Using symbolic labels defined by an \fIenum\fP command, we can write this as:
-
- .KS
- .nf
- enum {
- *port* (20 \*QFTP data\*U, 21 FTP, 23 Telnet, 25 SMTP,
- 37 Time, 42 Name, 43 Whois, 53 Domains,
- 69 TFTP, 79 Finger, 103 X.400, 104 \*QX.400-SND\*U,
- 109 POP2, 111 sunrpc, 115 SFTP, 119 NetNews,
- 153 SGMP, 512 exec, 513 \*Qrwho|rlogin\*U, 514 shell,
- 515 printer, 520 RIP)
- }
-
- attach {
- if TCP.dstport is port.telnet
- setf(\*QTelnet\*U, \*QWhois\*U, \*QFinger\*U, \*Qrwho|rlogin\*U)
- record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
-
- else if TCP.srcport is port.telnet
- record IP.srchost, IP.dsthost in Telnet.hosts;
-
- else if TCP.dstport is port.ftp setf(\*QFTP data\*U, \*QFTP\*U, \*QTFTP\*U)
- record IP.srchost, IP.dsthost in ftp.hosts matrix-sym;
-
- else if TCP.srcport is port.ftp
- record IP.srchost, IP.dsthost in ftp.hosts;
-
- else if TCP.dstport is port.mail
- setf(\*QSMTP\*U, \*QX.400\*U, \*QX.400-SND\*U, \*QNetNews\*U)
- record IP.srchost, IP.dsthost in mail.hosts matrix-sym;
-
- else if TCP.srcport is port.mail
- record IP.srchost, IP.dsthost in mail.hosts;
- }
- .KE
- .fi
-
- .mc
- Now, all this needs to be conditional upon packets coming and going
- through a specific gateway. This requires a very redundant configuration
- file, but the good news is that the redundancy does not affect either the
- CPU time or memory space required for data collection.
- Assuming the \fIenum\fP command of the previous example, the
- complete \fIattach\fP command can be written as:
- .mc |
- .KS
- .nf
- attach {
-
- if Ether.src is eqf(08:00:2b:03:4a:e7) {
- if TCP.dstport is port.telnet
- setf(\*QTelnet\*U, \*QWhois\*U, \*QFinger\*U, \*Qrwho|rlogin\*U)
- record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
-
- else if TCP.srcport is port.telnet
- record IP.srchost, IP.dsthost in Telnet.hosts;
-
- else if TCP.dstport is port.ftp setf(\*QFTP data\*U, \*QFTP\*U, \*QTFTP\*U)
- record IP.srchost, IP.dsthost in ftp.hosts matrix-sym;
-
- else if TCP.srcport is port.ftp
- record IP.srchost, IP.dsthost in ftp.hosts;
-
- else if TCP.dstport is port.mail
- setf(\*QSMTP\*U, \*QX.400\*U, \*QX.400-SND\*U, \*QNetNews\*U)
- record IP.srchost, IP.dsthost in mail.hosts matrix-sym;
-
- else if TCP.srcport is port.mail
- record IP.srchost, IP.dsthost in mail.hosts;
- }
- if Ether.dst is eqf(08:00:2b:03:4a:e7) {
- if TCP.dstport is port.telnet
- record IP.srchost, IP.dsthost in Telnet.hosts matrix-sym;
-
- else if TCP.srcport is port.telnet
- record IP.srchost, IP.dsthost in Telnet.hosts;
-
- else if TCP.dstport is port.ftp
- record IP.srchost, IP.dsthost in ftp.hosts matrix-sym;
-
- else if TCP.srcport is port.ftp
- record IP.srchost, IP.dsthost in ftp.hosts;
-
- else if TCP.dstport is port.mail
- record IP.srchost, IP.dsthost in mail.hosts matrix-sym;
-
- else if TCP.srcport is port.mail
- record IP.srchost, IP.dsthost in mail.hosts;
- }
- }
- .KE
- .fi
- .mc
-
-
-
-